Lab 3: Principal Components Analysis (PCA)
Stat 154, Spring 2018

Introduction

The goal of this lab is to go over the various options and steps required to perform a Principal Components Analysis (PCA). You will also learn about the functions prcomp() and princomp(), and how to use their outputs to answer questions like:
• How many principal components to retain.
• How to visualize the observations.
• How to visualize the relationships among variables.
• How to visualize supplementary variables.

Dataset

NBA Teams

In this lab we are going to use the data set about NBA teams, which contains per-game statistics for the 2016-2017 regular season. The corresponding CSV file is available in the data/ folder of the GitHub repo: https://github.com/ucb-stat154/stat154-spring-2018/tree/master/data

Your turn

Create a new data frame dat that contains the following columns:
• wins
• losses
• points
• field_goals
• points3
• free_throws
• off_rebounds
• def_rebounds
• assists
• steals
• blocks
• personal_fouls

repo<-'https://github.com/ucb-stat154/stat154-spring-2018/'
csv_file<-'raw/master/data/nba-teams-2017.csv'
url<-paste0(repo,csv_file)
download.file(url,destfile='nba-teams-2017.csv')
dataset<-read.csv('nba-teams-2017.csv',stringsAsFactors = FALSE)
str(dataset, vec.len = 1)
## 'data.frame':    30 obs. of  27 variables:
##  $ team                 : chr  "Golden State Warriors" ...
##  $ games_played         : int  82 82 ...
##  $ wins                 : int  67 61 ...
##  $ losses               : int  15 21 ...
##  $ win_prop             : num  0.817 0.744 ...
##  $ minutes              : num  48.2 48.3 ...
##  $ points               : num  116 ...
##  $ field_goals          : num  43.1 39.3 ...
##  $ field_goals_attempted: num  87.1 83.7 ...
##  $ field_goals_prop     : num  49.5 46.9 ...
##  $ points3              : num  12 9.2 ...
##  $ points3_attempted    : num  31.2 23.5 ...
##  $ points3_prop         : num  38.3 39.1 ...
##  $ free_throws          : num  17.8 17.6 ...
##  $ free_throws_att      : num  22.6 22 ...
##  $ free_throws_prop     : num  78.8 79.7 ...
##  $ off_rebounds         : num  9.4 10 ...
##  $ def_rebounds         : num  35 33.9 ...
##  $ rebounds             : num  44.4 43.9 ...
##  $ assists              : num  30.4 23.8 ...
##  $ turnovers            : num  14.8 13.4 ...
##  $ steals               : num  9.6 8 ...
##  $ blocks               : num  6.8 5.9 ...
##  $ block_fga            : num  3.8 4.1 ...
##  $ personal_fouls       : num  19.3 18.3 ...
##  $ personal_fouls_drawn : num  19.4 19.8 ...
##  $ plus_minus           : num  11.6 7.2 ...
dat<-subset(dataset,select=c('wins','losses','points','field_goals','points3','free_throws','off_rebounds','def_rebounds','assists','steals','blocks','personal_fouls'))
print(dat)
##    wins losses points field_goals points3 free_throws off_rebounds
## 1    67     15  115.9        43.1    12.0        17.8          9.4
## 2    61     21  105.3        39.3     9.2        17.6         10.0
## 3    55     27  115.3        40.3    14.4        20.3         10.9
## 4    53     29  108.0        38.6    12.0        18.7          9.1
## 5    51     31  100.7        37.0     9.6        17.1          9.4
## 6    51     31  106.9        39.2     8.8        19.7         10.6
## 7    51     31  110.3        39.9    13.0        17.5          9.3
## 8    51     31  108.7        39.5    10.3        19.3          9.0
## 9    49     33  109.2        41.3     9.2        17.3         10.3
## 10   47     35  106.6        39.5     8.4        19.2         12.2
## 11   43     39  100.5        36.4     9.4        18.3         10.8
## 12   43     39  103.2        38.1     8.9        18.1         10.3
## 13   42     40  105.1        39.3     8.6        17.9          9.0
## 14   42     40  103.6        38.8     8.8        17.2          8.8
## 15   41     41  102.9        38.6     7.6        18.0         12.2
## 16   41     41  107.9        39.5    10.4        18.5         10.1
## 17   41     41  103.2        39.0     9.9        15.2         10.6
## 18   40     42  111.7        41.2    10.6        18.7         11.8
## 19   37     45  101.3        39.9     7.7        13.9         11.1
## 20   36     46  104.9        37.7    10.0        19.4          8.8
## 21   34     48  104.3        39.1     9.4        16.7          8.6
## 22   33     49   97.9        36.2    10.7        14.8          7.9
## 23   32     50  102.8        37.9     9.0        18.1          8.7
## 24   31     51  105.6        39.5     7.3        19.3         11.4
## 25   31     51  104.3        39.6     8.6        16.6         12.0
## 26   29     53  101.1        38.3     8.5        16.0          9.8
## 27   28     54  102.4        37.7    10.1        17.0          9.8
## 28   26     56  104.6        39.3     8.9        17.0         11.4
## 29   24     58  107.7        39.9     7.5        20.4         11.9
## 30   20     62  105.8        37.8    10.7        19.4          8.8
##    def_rebounds assists steals blocks personal_fouls
## 1          35.0    30.4    9.6    6.8           19.3
## 2          33.9    23.8    8.0    5.9           18.3
## 3          33.5    25.2    8.2    4.3           19.9
## 4          32.9    25.2    7.5    4.1           20.6
## 5          33.8    20.1    6.7    5.0           18.8
## 6          32.6    18.5    8.3    4.9           20.8
## 7          34.4    22.7    6.6    4.0           18.1
## 8          34.0    22.5    7.5    4.2           19.8
## 9          32.6    23.9    8.5    4.1           21.3
## 10         34.4    21.0    7.9    5.0           20.9
## 11         32.0    21.3    8.0    4.2           22.4
## 12         34.1    23.6    8.2    4.8           18.2
## 13         33.0    22.5    8.2    5.0           19.5
## 14         31.6    24.2    8.1    5.3           20.2
## 15         34.1    22.6    7.8    4.8           17.7
## 16         33.5    21.1    7.0    5.0           21.2
## 17         33.0    21.2    7.2    5.7           20.5
## 18         34.6    25.3    6.9    3.9           19.1
## 19         34.6    21.1    7.0    3.8           17.9
## 20         34.8    23.1    7.0    4.8           16.6
## 21         35.1    22.8    7.8    5.5           18.2
## 22         30.7    20.8    7.5    3.7           19.1
## 23         32.3    22.5    7.6    4.0           20.3
## 24         31.0    23.7    8.0    4.5           20.1
## 25         33.2    21.8    7.1    5.5           20.3
## 26         33.3    22.2    7.1    4.8           19.3
## 27         33.0    23.8    8.4    5.1           21.9
## 28         32.1    20.9    8.2    3.9           20.7
## 29         33.1    19.6    8.2    4.9           24.8
## 30         35.1    21.4    7.2    4.7           21.0

Spend some time examining things like:

• descriptive statistics with summary().

summary(dat)
##       wins           losses          points       field_goals   
##  Min.   :20.00   Min.   :15.00   Min.   : 97.9   Min.   :36.20  
##  1st Qu.:32.25   1st Qu.:31.50   1st Qu.:103.0   1st Qu.:38.15  
##  Median :41.00   Median :41.00   Median :105.0   Median :39.25  
##  Mean   :41.00   Mean   :41.00   Mean   :105.6   Mean   :39.05  
##  3rd Qu.:50.50   3rd Qu.:49.75   3rd Qu.:107.8   3rd Qu.:39.58  
##  Max.   :67.00   Max.   :62.00   Max.   :115.9   Max.   :43.10  
##     points3       free_throws     off_rebounds     def_rebounds  
##  Min.   : 7.30   Min.   :13.90   Min.   : 7.900   Min.   :30.70  
##  1st Qu.: 8.65   1st Qu.:17.02   1st Qu.: 9.025   1st Qu.:32.67  
##  Median : 9.30   Median :17.95   Median :10.050   Median :33.40  
##  Mean   : 9.65   Mean   :17.83   Mean   :10.133   Mean   :33.38  
##  3rd Qu.:10.38   3rd Qu.:19.07   3rd Qu.:11.050   3rd Qu.:34.33  
##  Max.   :14.40   Max.   :20.40   Max.   :12.200   Max.   :35.10  
##     assists          steals          blocks      personal_fouls 
##  Min.   :18.50   Min.   :6.600   Min.   :3.700   Min.   :16.60  
##  1st Qu.:21.12   1st Qu.:7.125   1st Qu.:4.125   1st Qu.:18.88  
##  Median :22.50   Median :7.800   Median :4.800   Median :20.00  
##  Mean   :22.63   Mean   :7.710   Mean   :4.740   Mean   :19.89  
##  3rd Qu.:23.77   3rd Qu.:8.200   3rd Qu.:5.000   3rd Qu.:20.77  
##  Max.   :30.40   Max.   :9.600   Max.   :6.800   Max.   :24.80

• univariate plots: boxplots, histograms, density curves.

boxplot(dat)

# one histogram per variable, drawn on the density scale (freq = FALSE)
# so that the overlaid density curve is on the same vertical scale
for (v in names(dat)) {
  hist(dat[[v]], freq = FALSE, main = v, xlab = v)
  lines(density(dat[[v]]))
}

• compute the correlation matrix.

print(cor(dat))
##                       wins      losses      points field_goals     points3
## wins            1.00000000 -1.00000000  0.50752590  0.40747195  0.44681094
## losses         -1.00000000  1.00000000 -0.50752590 -0.40747195 -0.44681094
## points          0.50752590 -0.50752590  1.00000000  0.81520604  0.57537543
## field_goals     0.40747195 -0.40747195  0.81520604  1.00000000  0.16940382
## points3         0.44681094 -0.44681094  0.57537543  0.16940382  1.00000000
## free_throws     0.13913686 -0.13913686  0.55875989  0.16018996  0.17182294
## off_rebounds   -0.08560734  0.08560734  0.14922012  0.33705067 -0.37624968
## def_rebounds    0.21692340 -0.21692340  0.36992824  0.33683802  0.22765772
## assists         0.46268513 -0.46268513  0.57734309  0.52991280  0.46015123
## steals          0.27032897 -0.27032897  0.30982800  0.35777164 -0.05430218
## blocks          0.28131403 -0.28131403  0.15342092  0.27055358 -0.06847851
## personal_fouls -0.27270936  0.27270936  0.08680588  0.02144334 -0.12042193
##                free_throws off_rebounds def_rebounds     assists
## wins            0.13913686  -0.08560734   0.21692340  0.46268513
## losses         -0.13913686   0.08560734  -0.21692340 -0.46268513
## points          0.55875989   0.14922012   0.36992824  0.57734309
## field_goals     0.16018996   0.33705067   0.33683802  0.52991280
## points3         0.17182294  -0.37624968   0.22765772  0.46015123
## free_throws     1.00000000   0.17192870   0.12770466  0.08978386
## off_rebounds    0.17192870   1.00000000   0.02385551 -0.16448415
## def_rebounds    0.12770466   0.02385551   1.00000000  0.23022655
## assists         0.08978386  -0.16448415   0.23022655  1.00000000
## steals          0.23062295   0.07718630  -0.19672636  0.42982794
## blocks         -0.00406670  -0.02172908   0.30223031  0.30242149
## personal_fouls  0.31602643   0.27205446  -0.41289270 -0.26139863
##                     steals      blocks personal_fouls
## wins            0.27032897  0.28131403    -0.27270936
## losses         -0.27032897 -0.28131403     0.27270936
## points          0.30982800  0.15342092     0.08680588
## field_goals     0.35777164  0.27055358     0.02144334
## points3        -0.05430218 -0.06847851    -0.12042193
## free_throws     0.23062295 -0.00406670     0.31602643
## off_rebounds    0.07718630 -0.02172908     0.27205446
## def_rebounds   -0.19672636  0.30223031    -0.41289270
## assists         0.42982794  0.30242149    -0.26139863
## steals          1.00000000  0.37392387     0.31375577
## blocks          0.37392387  1.00000000    -0.02772306
## personal_fouls  0.31375577 -0.02772306     1.00000000

• get a scatterplot matrix with pairs()

pairs(dat)

Your turn

As we saw in lecture, the minimal output of a PCA procedure should consist of eigenvalues, loadings, and principal components. Create the following objects:

• eigenvalues: vector of eigenvalues (i.e. λ1, λ2, . . .)

pca_prcomp <- prcomp(dat, scale. = TRUE)
eigenvalues<-pca_prcomp$sdev^2
print(eigenvalues)
##  [1] 4.164615e+00 2.061621e+00 1.377660e+00 1.339038e+00 9.445148e-01
##  [6] 8.266370e-01 5.418803e-01 3.172210e-01 2.634804e-01 1.632044e-01
## [11] 1.273592e-04 6.167771e-33

• loadings: matrix of eigenvectors (i.e. V)

V<-pca_prcomp$rotation
print(V)
##                         PC1         PC2         PC3         PC4
## wins           -0.398075412  0.15598612  0.10745878 -0.15334568
## losses          0.398075412 -0.15598612 -0.10745878  0.15334568
## points         -0.419824132 -0.19938732 -0.30908139  0.07930596
## field_goals    -0.357358372 -0.23821878  0.04168063  0.29261717
## points3        -0.282642201  0.24236778 -0.43852168 -0.26438584
## free_throws    -0.167115047 -0.34510341 -0.42618447 -0.04954457
## off_rebounds    0.009704962 -0.44232712  0.01245106  0.45567381
## def_rebounds   -0.212971258  0.19999106 -0.10297431  0.60152383
## assists        -0.369015007  0.06739504  0.12123879 -0.10249238
## steals         -0.207867923 -0.36794870  0.39186478 -0.33667376
## blocks         -0.196357624 -0.05130561  0.55914063  0.13587072
## personal_fouls  0.088950346 -0.54661171 -0.11852570 -0.27733937
##                         PC5         PC6         PC7         PC8
## wins            0.468414493 -0.18979579  0.04365872 -0.04471559
## losses         -0.468414493  0.18979579 -0.04365872  0.04471559
## points         -0.095512938  0.11327669  0.07578747 -0.15149979
## field_goals     0.006748312  0.40970709  0.16943791 -0.45161451
## points3        -0.163151046  0.07974936  0.37045051  0.38506660
## free_throws    -0.097697769 -0.52878918 -0.48681293  0.03206656
## off_rebounds    0.442773172  0.13341250 -0.01438259  0.59025255
## def_rebounds   -0.266371336 -0.27622434  0.02534623 -0.12997347
## assists        -0.330000660  0.40087945 -0.29590090  0.41909728
## steals         -0.149511341  0.05253747 -0.29905033 -0.12759984
## blocks         -0.334421944 -0.43700019  0.32729470  0.24784686
## personal_fouls -0.075831806 -0.11293000  0.55004993 -0.03383988
##                        PC9         PC10          PC11          PC12
## wins            0.03092493  0.147801412 -0.0005032429  7.071068e-01
## losses         -0.03092493 -0.147801412  0.0005032429  7.071068e-01
## points         -0.17172943 -0.199970573 -0.7496947197 -2.747802e-15
## field_goals    -0.23247393 -0.069322551  0.5184251837  1.776357e-15
## points3         0.20677816 -0.388229297  0.2953012488  1.054712e-15
## free_throws    -0.24426074  0.008749499  0.2863049501  9.853229e-16
## off_rebounds    0.13875304 -0.121463456 -0.0010947897  2.081668e-17
## def_rebounds    0.56616194  0.238503116 -0.0001195162  1.249001e-16
## assists        -0.10854694  0.538040594 -0.0027449836  3.053113e-16
## steals          0.55994315 -0.331865117 -0.0001105887 -5.551115e-17
## blocks         -0.34632118 -0.190812412 -0.0027385340  0.000000e+00
## personal_fouls  0.16456848  0.503039241 -0.0017430072  5.551115e-17

• scores: matrix of principal components (i.e. Z = XV)

Z<-pca_prcomp$x
print(Z)
##            PC1         PC2        PC3         PC4          PC5         PC6
## 1  -7.11481922 -0.32475222  2.2905098 -0.34137621 -1.439762641  0.55967095
## 2  -2.14360862  0.97292384  1.8195990  0.08591542  0.862169896 -1.06432907
## 3  -3.86843431 -0.54512433 -2.3009107 -0.90366254  0.280477825  0.37529668
## 4  -1.55766480  0.74905508 -1.3461868 -1.66593211  0.325782424  0.14491866
## 5   0.91352401  2.18374798  0.2117517  0.05745074  1.166635570 -1.61695575
## 6  -0.28521066 -1.42113610  0.0763837 -0.59755356  1.482174192 -1.56463965
## 7  -1.69736079  2.23488894 -2.1612456  0.39672861  0.558873489  0.48670516
## 8  -1.30123596  0.52231238 -1.2228823 -0.35011719  0.387159395 -0.56072762
## 9  -1.43097081 -1.29564930  0.2210834 -0.75781908  0.778681084  1.42448842
## 10 -0.54648178 -1.52688188  0.1046333  1.25214874  1.062591040 -1.09772919
## 11  1.70208669 -0.87147140 -0.0982859 -1.74276279  1.089643984 -0.80186740
## 12 -0.11434088  0.53813869  0.7736702  0.25176282 -0.032864729 -0.39038642
## 13 -0.07617772  0.01529814  0.8697323 -0.55682931 -0.321647533 -0.20120494
## 14  0.19710891  0.04272970  1.4447019 -1.57614215 -0.425153814  0.23471436
## 15  0.45166980 -0.08074649  0.9137719  1.65808320  0.892597258 -0.18957926
## 16 -0.09477052 -0.29372442 -0.9534533  0.30653759 -0.002903886 -0.61308771
## 17  0.73274686  0.48408794  1.2000204  0.37582601  0.295813723  0.03415113
## 18 -1.55514793 -0.26225599 -1.9407432  1.91002123  0.177054600  1.45651783
## 19  1.68812268  1.48083835  0.6737650  2.22027091  1.207690640  1.70581745
## 20  0.21718653  2.05342377 -0.8576597  0.88526001 -1.128567884 -1.01172878
## 21  0.09852432  1.38211494  0.9610816  0.94561280 -1.727120433 -0.29377075
## 22  3.29018415  2.20511611  0.1090580 -2.65417910  0.228780523  1.05911971
## 23  1.76352352  0.22697261 -0.5008575 -1.17412293 -0.543573523  0.26926494
## 24  1.09720645 -2.07177935  0.3026713 -0.32079048  0.179893343  0.90005392
## 25  1.20967806 -0.76672601  0.7379663  1.60116378 -0.025882362  0.37837293
## 26  2.12261877  0.87179686  0.6038219  0.61698512 -0.574392528  0.49510854
## 27  1.25996525 -0.72251736  0.7001585 -1.02389729 -1.643494495  0.23030656
## 28  1.97563355 -1.54719449 -0.1739730 -0.05992243  0.144607196  1.42570137
## 29  1.51688073 -4.30908967 -0.6415354  0.54999932 -0.678984984 -0.78917196
## 30  1.54956375  0.07560368 -1.8166468  0.61134085 -2.576277370 -0.98503009
##             PC7          PC8         PC9        PC10          PC11
## 1   0.317942817 -0.006314034  0.22210644  0.02331291  0.0037710703
## 2  -0.125615582  0.142212594 -0.23968151  0.09193448  0.0125315307
## 3  -0.023658569  0.907390055  0.34176752 -0.83888840  0.0080870617
## 4   0.046643073  0.381040431 -0.06350358  0.78971692 -0.0147287675
## 5   0.530069736  0.155521649 -0.15190088  0.25732680  0.0015446228
## 6  -0.019896926 -0.887517524 -0.02787462 -0.85975141  0.0075756971
## 7   0.718655029 -0.236896830 -0.17013618 -0.29438019  0.0078716269
## 8  -0.263399020 -0.933902443 -0.04135255  0.37419491 -0.0101032434
## 9   0.089330324 -1.027046922  0.26446260  0.37317237 -0.0191405061
## 10 -0.007721053  0.106781817  0.48425539  0.14712607  0.0013738371
## 11  0.005430565  0.883926221  0.79133583  0.59401213  0.0013764401
## 12 -1.277488808  0.344752999  0.66458940 -0.02225418  0.0007750084
## 13 -0.445845558 -0.839300437 -0.21075099 -0.14375804  0.0011500550
## 14 -0.103185515 -0.102851617 -0.87147836  0.23538112 -0.0043804662
## 15 -1.379921807  0.712018440  0.30594061 -0.15569631 -0.0180557545
## 16  1.148446203 -0.151702097 -0.64902616  0.03858305  0.0033038023
## 17  1.888211492  0.526346303 -0.34829490 -0.23930014 -0.0263530159
## 18 -0.330084529  0.396140448 -0.34024113  0.47401019  0.0045067576
## 19  0.214101899 -0.734317529  0.81161157  0.28643531  0.0154369416
## 20 -1.396191203  0.128766662 -0.43561512 -0.24386990 -0.0090153168
## 21 -0.002524130 -0.604504095  0.33500906 -0.29998999 -0.0013776825
## 22  0.208465643  0.008833207  0.15124148 -0.59221748  0.0031297100
## 23 -0.634579536 -0.448495837 -0.23524701  0.38305438  0.0238269356
## 24 -1.403633024  0.336472538 -1.32636846 -0.07902744  0.0027204653
## 25  0.959809753  0.811830290 -0.71619970 -0.17556800  0.0127101497
## 26  0.210830667  0.106777477 -0.27879892  0.22557987 -0.0022390876
## 27  0.409438020  0.792303333  0.80756402  0.17722898  0.0133777349
## 28 -0.167583342 -0.102485014  0.47565321 -0.76146442 -0.0147198399
## 29  0.590401425 -0.504715760  0.04974661  0.19062664  0.0043104563
## 30  0.243551958 -0.161064323  0.40118632  0.04446976 -0.0092662233
##             PC12
## 1   1.632454e-15
## 2   8.659919e-16
## 3   5.348558e-16
## 4   5.762514e-16
## 5   2.688263e-16
## 6  -3.846643e-16
## 7   3.652996e-16
## 8   2.206574e-16
## 9   1.796518e-16
## 10  4.444802e-17
## 11  4.515247e-17
## 12  2.855369e-16
## 13 -9.364074e-17
## 14  8.295629e-17
## 15  2.613267e-17
## 16 -1.592937e-16
## 17 -1.423879e-16
## 18  3.062346e-16
## 19 -8.655010e-17
## 20 -2.681813e-17
## 21 -6.655394e-17
## 22 -5.867043e-16
## 23 -1.850838e-16
## 24 -4.803115e-16
## 25 -2.902691e-16
## 26 -2.309991e-16
## 27 -1.121738e-16
## 28 -8.560654e-16
## 29 -8.543717e-16
## 30 -3.928391e-16
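
The three objects can be cross-checked against a direct eigendecomposition of the correlation matrix. The sketch below is only an illustration: it uses a small simulated matrix X (an assumption, not the NBA data) so that it runs on its own. The same calls should apply to dat, although its near-zero 11th and 12th dimensions may make the last eigenvectors numerically unstable.

```r
# Toy data: 30 observations, 4 variables (simulated, not the NBA data)
set.seed(154)
X <- matrix(rnorm(30 * 4), nrow = 30, ncol = 4)
X[, 2] <- X[, 1] + 0.5 * X[, 2]   # make two columns correlated
colnames(X) <- paste0("x", 1:4)

pca <- prcomp(X, scale. = TRUE)
evd <- eigen(cor(X))

# eigenvalues: squared sdev of the PCs = eigenvalues of the correlation matrix
eig_ok <- isTRUE(all.equal(pca$sdev^2, evd$values))

# loadings: the columns of V are eigenvectors, defined only up to sign
load_ok <- isTRUE(all.equal(abs(unname(pca$rotation)), abs(evd$vectors)))

# scores: Z = XV, where X is the standardized data matrix
Z <- scale(X) %*% pca$rotation
score_ok <- isTRUE(all.equal(unname(Z), unname(pca$x)))

print(c(eigenvalues = eig_ok, loadings = load_ok, scores = score_ok))
```

Comparing absolute values is a deliberate choice here, because the sign of each eigenvector is arbitrary.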

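Note: The signs of the columns of the loadings and scores are arbitrary, and so may differ between different programs for PCA, and even between different builds of R. Look around at the output of your neighbors to see who has similar results to yours, and who has different outputs.

Quickly inspect the objects created above:
• How many eigenvalues are almost zero (or zero)? Two: the 11th (about 1.3e-04) and the 12th (about 6.2e-33).
• What about the loading associated to the 12th PC? Only wins and losses have non-zero loadings (both 0.707, i.e. 1/sqrt(2)); all other entries are numerically zero.
• What about the 12th PC score? All scores on the 12th PC are numerically zero (on the order of 1e-16 or smaller).
• Can you guess what’s going on with the values of the 12th dimension? wins and losses are perfectly collinear: every team plays 82 games, so losses = 82 − wins and their correlation is exactly −1. The standardized data therefore has rank at most 11, and the 12th dimension carries no variance. The near-zero 11th eigenvalue most likely reflects a second, almost exact relation: points ≈ 2 × field_goals + points3 + free_throws, up to rounding of the per-game averages (for the first team, 2 × 43.1 + 12.0 + 17.8 = 116.0 against 115.9).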
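
A zero eigenvalue of this kind can be reproduced with a minimal sketch. The data frame toy below is simulated (an assumption for illustration, not the NBA data): its losses column is defined as 82 minus wins, which mimics the relation between wins and losses in a season of 82 games.

```r
# Toy data (simulated): wins and losses always add up to 82 games
set.seed(1)
wins <- sample(20:67, size = 30, replace = TRUE)
toy <- data.frame(
  wins   = wins,
  losses = 82 - wins,
  points = rnorm(30, mean = 105, sd = 4)
)

# perfect negative correlation between wins and losses
print(cor(toy$wins, toy$losses))

pca_toy <- prcomp(toy, scale. = TRUE)
last_eigenvalue <- pca_toy$sdev[3]^2    # numerically zero
last_loading <- pca_toy$rotation[, 3]   # equal weights on wins and losses, none on points

print(last_eigenvalue)
print(round(last_loading, 3))
```

The last loading vector puts equal weight on wins and losses because the standardized versions of the two variables are exact negatives of each other, so their sum is constant.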

Your turn

• Compare the results of prcomp() against those of princomp() in terms of eigenvalues, loadings, and PCs.

pca_princomp <- princomp(dat, cor = TRUE)
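eigenvalues2 <- pca_princomp$sdev^2
# this prints the prcomp() vector `eigenvalues`; use print(eigenvalues2) for the
# princomp() values, which should agree up to floating-point error when cor = TRUE
print(eigenvalues)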
##  [1] 4.164615e+00 2.061621e+00 1.377660e+00 1.339038e+00 9.445148e-01
##  [6] 8.266370e-01 5.418803e-01 3.172210e-01 2.634804e-01 1.632044e-01
## [11] 1.273592e-04 6.167771e-33
# stored under a new name so that the prcomp() loadings in V are not overwritten
V2 <- pca_princomp$loadings
print(V2)
## 
## Loadings:
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## wins            0.398  0.156  0.107 -0.153  0.468  0.190              
## losses         -0.398 -0.156 -0.107  0.153 -0.468 -0.190              
## points          0.420 -0.199 -0.309               -0.113         0.151
## field_goals     0.357 -0.238         0.293        -0.410  0.169  0.452
## points3         0.283  0.242 -0.439 -0.264 -0.163         0.370 -0.385
## free_throws     0.167 -0.345 -0.426                0.529 -0.487       
## off_rebounds          -0.442         0.456  0.443 -0.133        -0.590
## def_rebounds    0.213  0.200 -0.103  0.602 -0.266  0.276         0.130
## assists         0.369         0.121 -0.102 -0.330 -0.401 -0.296 -0.419
## steals          0.208 -0.368  0.392 -0.337 -0.150        -0.299  0.128
## blocks          0.196         0.559  0.136 -0.334  0.437  0.327 -0.248
## personal_fouls        -0.547 -0.119 -0.277         0.113  0.550       
##                Comp.9 Comp.10 Comp.11 Comp.12
## wins                   0.148           0.707 
## losses                -0.148           0.707 
## points          0.172 -0.200   0.750         
## field_goals     0.232         -0.518         
## points3        -0.207 -0.388  -0.295         
## free_throws     0.244         -0.286         
## off_rebounds   -0.139 -0.121                 
## def_rebounds   -0.566  0.239                 
## assists         0.109  0.538                 
## steals         -0.560 -0.332                 
## blocks          0.346 -0.191                 
## personal_fouls -0.165  0.503                 
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.083  0.083  0.083  0.083  0.083  0.083  0.083  0.083
## Cumulative Var  0.083  0.167  0.250  0.333  0.417  0.500  0.583  0.667
##                Comp.9 Comp.10 Comp.11 Comp.12
## SS loadings     1.000   1.000   1.000   1.000
## Proportion Var  0.083   0.083   0.083   0.083
## Cumulative Var  0.750   0.833   0.917   1.000
# stored under a new name so that the prcomp() scores in Z are not overwritten
Z2 <- pca_princomp$scores
print(Z2)
##         Comp.1      Comp.2      Comp.3      Comp.4       Comp.5
## 1   7.23644888 -0.33030394  2.32966665 -0.34721212 -1.464375751
## 2   2.18025415  0.98955622  1.85070553  0.08738417  0.876908911
## 3   3.93456619 -0.55444337 -2.34024534 -0.91911088  0.285272665
## 4   1.58429347  0.76186037 -1.36920017 -1.69441165  0.331351758
## 5  -0.92914094  2.22107971  0.21537162  0.05843287  1.186579503
## 6   0.29008641 -1.44543078  0.07768950 -0.60776889  1.507512339
## 7   1.72637760  2.27309493 -2.19819268  0.40351079  0.568427574
## 8   1.32348093  0.53124144 -1.24378781 -0.35610253  0.393777984
## 9   1.45543362 -1.31779875  0.22486284 -0.77077419  0.791992836
## 10  0.55582403 -1.55298432  0.10642207  1.27355454  1.080756306
## 11 -1.73118430 -0.88636943 -0.09996612 -1.77255576  1.108271726
## 12  0.11629556  0.54733831  0.78689628  0.25606677 -0.033426560
## 13  0.07748000  0.01555967  0.88460064 -0.56634845 -0.327146179
## 14 -0.20047854  0.04346017  1.46939945 -1.60308670 -0.432421927
## 15 -0.45939121 -0.08212687  0.92939308  1.68642856  0.907856437
## 16  0.09639065 -0.29874572 -0.96975280  0.31177793 -0.002953529
## 17 -0.74527335  0.49236355  1.22053509  0.38225086  0.300870734
## 18  1.58173358 -0.26673932 -1.97392066  1.94267353  0.180081394
## 19 -1.71698156  1.50615366  0.68528319  2.25822700  1.228336420
## 20 -0.22089939  2.08852757 -0.87232161  0.90039376 -1.147861041
## 21 -0.10020862  1.40574255  0.97751152  0.96177829 -1.756645998
## 22 -3.34643069  2.24281313  0.11092237 -2.69955297  0.232691584
## 23 -1.79367141  0.23085276 -0.50941975 -1.19419486 -0.552866051
## 24 -1.11596347 -2.10719695  0.30784553 -0.32627447  0.182968665
## 25 -1.23035781 -0.77983338  0.75058200  1.62853608 -0.026324827
## 26 -2.15890548  0.88670045  0.61414435  0.62753264 -0.584211915
## 27 -1.28150468 -0.73486898  0.71212786 -1.04140107 -1.671590453
## 28 -2.00940751 -1.57364418 -0.17694716 -0.06094682  0.147079293
## 29 -1.54281219 -4.38275466 -0.65250261  0.55940170 -0.690592406
## 30 -1.57605394  0.07689615 -1.84770288  0.62179188 -2.620319490
##         Comp.6       Comp.7       Comp.8      Comp.9     Comp.10
## 1  -0.56923867  0.323378131  0.006421975 -0.22590340  0.02371145
## 2   1.08252405 -0.127763013 -0.144643755  0.24377893  0.09350612
## 3  -0.38171247 -0.024063018 -0.922902119 -0.34761012 -0.85322941
## 4  -0.14739609  0.047440448 -0.387554415  0.06458919  0.80321734
## 5   1.64459802  0.539131413 -0.158180331  0.15449766  0.26172587
## 6   1.59138757 -0.020237069  0.902689863  0.02835115 -0.87444908
## 7  -0.49502550  0.730940620  0.240946641  0.17304470 -0.29941269
## 8   0.57031340 -0.267901893  0.949867744  0.04205948  0.38059187
## 9  -1.44884041  0.090857449  1.044604551 -0.26898366  0.37955184
## 10  1.11649515 -0.007853046 -0.108607279 -0.49253386  0.14964123
## 11  0.81557553  0.005523402 -0.899037165 -0.80486392  0.60416692
## 12  0.39706017 -1.299327806 -0.350646640 -0.67595073 -0.02263462
## 13  0.20464459 -0.453467402  0.853648493  0.21435383 -0.14621562
## 14 -0.23872686 -0.104949498  0.104609892  0.88637651  0.23940502
## 15  0.19282017 -1.403511923 -0.724190577 -0.31117074 -0.15835797
## 16  0.62356860  1.168079184  0.154295484  0.66012143  0.03924263
## 17 -0.03473496  1.920490950 -0.535344327  0.35424909 -0.24339104
## 18 -1.48141737 -0.335727408 -0.402912570  0.34605763  0.48211352
## 19 -1.73497883  0.217762026  0.746870875 -0.82548628  0.29133199
## 20  1.02902454 -1.420059448 -0.130967961  0.44306207 -0.24803892
## 21  0.29879284 -0.002567281  0.614838246 -0.34073613 -0.30511839
## 22 -1.07722563  0.212029416 -0.008984213 -0.15382699 -0.60234159
## 23 -0.27386809 -0.645427836  0.456162988  0.23926861  0.38960280
## 24 -0.91544057 -1.427628489 -0.342224622  1.34904307 -0.08037843
## 25 -0.38484132  0.976217946 -0.825708736  0.72844331 -0.17856938
## 26 -0.50357255  0.214434871 -0.108602865  0.28356506  0.22943621
## 27 -0.23424371  0.416437467 -0.805847960 -0.82136953  0.18025876
## 28 -1.45007410 -0.170448222  0.104237022 -0.48378462 -0.77448185
## 29  0.80266306  0.600494488  0.513344005 -0.05059704  0.19388545
## 30  1.00186943  0.247715541  0.163817759 -0.40804471  0.04522998
##          Comp.11       Comp.12
## 1  -0.0038355378 -5.942721e-15
## 2  -0.0127457604 -2.669569e-14
## 3  -0.0082253121 -1.615294e-14
## 4   0.0149805596  2.913457e-14
## 5  -0.0015710286 -5.920484e-15
## 6  -0.0077052056 -1.741749e-14
## 7  -0.0080061944 -1.635149e-14
## 8   0.0102759609  1.989617e-14
## 9   0.0194677179  3.843099e-14
## 10 -0.0013973232 -2.953595e-15
## 11 -0.0013999707 -3.594482e-15
## 12 -0.0007882574 -1.950876e-15
## 13 -0.0011697155 -2.491608e-15
## 14  0.0044553514  8.375594e-15
## 15  0.0183644222  3.446904e-14
## 16 -0.0033602816 -6.367749e-15
## 17  0.0268035274  5.008996e-14
## 18 -0.0045838018 -6.394542e-15
## 19 -0.0157008400 -3.050795e-14
## 20  0.0091694360  1.757720e-14
## 21  0.0014012343  3.347594e-15
## 22 -0.0031832132 -8.635171e-15
## 23 -0.0242342632 -4.577528e-14
## 24 -0.0027669724 -4.453825e-15
## 25 -0.0129274330 -2.473427e-14
## 26  0.0022773653  4.842052e-15
## 27 -0.0136064307 -2.451377e-14
## 28  0.0149714793  2.939253e-14
## 29 -0.0043841447 -5.344336e-15
## 30  0.0094246317  2.100339e-14

• If you carefully look at the princomp() loadings, you should notice that some values are left in blank. Why is this? Check the documentation ?princomp

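A: Small loadings are conventionally not printed (they are replaced by spaces), to draw the eye to the pattern of the larger loadings. The print method for loadings hides entries whose absolute value is below its cutoff argument (0.1 by default); the values are still stored in the object.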

Your turn

What are the differences between prcomp() and princomp()? Spend some time reading the help documentation of both functions to find out the main differences between them. Are there any cases when it would be better to use one function or the other?

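A: princomp() computes the PCs from an eigendecomposition (eigen()) of the covariance or correlation matrix, and it uses the divisor n for the variance. prcomp() uses a singular value decomposition (svd()) of the centered (and possibly scaled) data matrix, which is the numerically preferred method, and it uses the usual divisor n − 1. This explains the outputs above: the eigenvalues agree because the correlation matrix was used in both cases, while the princomp() scores are larger than the prcomp() scores by a factor of sqrt(30/29) ≈ 1.017 (for example, 7.236 against 7.115 for the first team on the first component). princomp() only handles so-called R-mode PCA, that is, feature extraction of variables. If a data matrix is supplied (possibly via a formula), it is required that there are at least as many units as variables. For Q-mode PCA use prcomp().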
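
The effect of the two divisors can be checked with a short sketch. It uses USArrests, a data set that ships with R, purely as a stand-in (an assumption for illustration); the same comparison should hold for dat on the components with non-negligible variance.

```r
# USArrests ships with R (datasets package); used here only as a stand-in
X <- USArrests
n <- nrow(X)

p1 <- prcomp(X, scale. = TRUE)   # SVD of the standardized data, divisor n - 1
p2 <- princomp(X, cor = TRUE)    # eigendecomposition of the correlation matrix, divisor n

# eigenvalues agree when the correlation matrix is used
same_eigenvalues <- isTRUE(all.equal(p1$sdev^2, unname(p2$sdev^2)))

# scores agree in absolute value after rescaling by sqrt(n / (n - 1))
ratio <- sqrt(n / (n - 1))
same_scores <- isTRUE(all.equal(abs(unname(p1$x)) * ratio, abs(unname(p2$scores))))

print(c(same_eigenvalues = same_eigenvalues, same_scores = same_scores))
print(ratio)
```

Absolute values are compared because the two functions may return a component with opposite signs.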

Your turn

Compute a table containing the eigenvalues, the variance in terms of percentages, and the cumulative percentages, like the table below. Analysts typically look at a bar-chart of the eigenvalues (see figure below). Plot your own bar-chart.

# percentage of variance captured by each PC, and the running total
per <- 100 * eigenvalues / sum(eigenvalues)
cum_per <- cumsum(per)
data.frame(eigenvalue = eigenvalues, percentage = per, cumulative.percentage = cum_per)
barplot(eigenvalues)

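• How much of the variation in the data is captured by the first PC? About 34.7% (eigenvalue 4.164615 out of a total of 12, the number of standardized variables).
• How much of the variation in the data is captured by the second PC? About 17.2% (eigenvalue 2.061621 out of 12).
• How much of the variation in the data is captured by the first two PCs? About 51.9% (4.164615 + 2.061621 = 6.226236 out of 12).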
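
As a cross-check on a hand-built table, the proportions can be compared with what summary() reports for a prcomp object. The sketch below uses the built-in USArrests data as a stand-in (an assumption for illustration); with standardized variables the eigenvalues add up to the number of variables.

```r
# USArrests ships with R; used here only as a stand-in for dat
pca <- prcomp(USArrests, scale. = TRUE)

ev <- pca$sdev^2
prop <- ev / sum(ev)
tab <- data.frame(
  eigenvalue = ev,
  percentage = 100 * prop,
  cumulative.percentage = 100 * cumsum(prop)
)
print(round(tab, 3))

# summary() stores the same proportions (rounded) in its importance matrix
imp <- summary(pca)$importance
print(imp)

# with scale. = TRUE the eigenvalues sum to the number of variables
print(sum(ev))
```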

Your turn

• Calculate a matrix (or table) of correlations between the variables and the PCs. In other words, what are the correlations of the variables with the 1st PC, with the 2nd PC, and so on.

cc<-cor(dat,pca_prcomp$x)
library(factoextra)
## Loading required package: ggplot2
## Welcome! Related Books: `Practical Guide To Cluster Analysis in R` at https://goo.gl/13EFCZ
fviz_pca_var(pca_prcomp, col.var = "black")

• What variables seem to be more correlated with PC1?

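points has the largest loading on PC1 in absolute value (0.42), followed by wins and losses (0.40), assists (0.37), and field_goals (0.36). Within a given PC, the correlations are proportional to the loadings, so the ranking is the same.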

• What variables seem to be more correlated with PC2?

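personal_fouls has the largest loading on PC2 in absolute value (0.55), followed by off_rebounds (0.44), steals (0.37), and free_throws (0.35).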
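
When the PCA is carried out on standardized variables, the correlation between a variable and a PC equals the corresponding loading multiplied by the standard deviation of that PC (the square root of its eigenvalue). The following sketch checks this identity on the built-in USArrests data, which is used only as a stand-in (an assumption for illustration).

```r
# USArrests ships with R; used here only as a stand-in for dat
pca <- prcomp(USArrests, scale. = TRUE)

# correlations computed directly from the data and the scores
cors_direct <- cor(USArrests, pca$x)

# the same correlations obtained as loading * sdev, column by column
cors_from_loadings <- sweep(pca$rotation, 2, pca$sdev, "*")

identity_holds <- isTRUE(all.equal(cors_direct, cors_from_loadings,
                                   check.attributes = FALSE))
print(identity_holds)

# variables ordered by absolute correlation with the first PC
print(sort(abs(cors_direct[, "PC1"]), decreasing = TRUE))
```

Because every entry in a column is multiplied by the same factor, ranking variables by loading or by correlation gives the same order within a PC.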

Your turn

Begin with a scatterplot of the first two PCs (see figure below).

library(plotly)
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
# data frame for plot_ly()
scores_df <- cbind.data.frame(
  pca_prcomp$x,
  team = dataset$team, stringsAsFactors = FALSE
)
# scatter plot
plot_ly(data = scores_df, x = ~PC1, y = ~PC2, type = 'scatter',
        mode = 'markers',
        text = ~team,
        marker = list(size = 10))

• Also plot PC1 - PC3, and then plot PC2 - PC3. If you want, continue visualizing other scatterplots.

plot_ly(data = scores_df, x = ~PC1, y = ~PC3, type = 'scatter',
        mode = 'markers',
        text = ~team,
        marker = list(size = 10))
plot_ly(data = scores_df, x = ~PC2, y = ~PC3, type = 'scatter',
        mode = 'markers',
        text = ~team,
        marker = list(size = 10))

• What patterns do you see?

• Try adding numeric labels to the points to see which observations seem to be potential outliers.
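
One possible way to add numeric labels is with base graphics, drawing the row numbers instead of points. This is only a sketch: the matrix X is simulated (an assumption so that the code runs on its own), and the distance from the origin is a crude, hypothetical criterion for flagging candidates, not a formal outlier test.

```r
# Simulated stand-in for the data (not the NBA data)
set.seed(2018)
X <- matrix(rnorm(30 * 5), nrow = 30)
pca <- prcomp(X, scale. = TRUE)
scores <- pca$x

# empty plot, then the row number of each observation as its label
labels <- seq_len(nrow(scores))
plot(scores[, 1], scores[, 2], type = "n", xlab = "PC1", ylab = "PC2")
text(scores[, 1], scores[, 2], labels = labels)
abline(h = 0, v = 0, lty = 2, col = "gray")

# crude flag: the three observations farthest from the origin in the PC1-PC2 plane
d <- sqrt(scores[, 1]^2 + scores[, 2]^2)
far <- head(order(d, decreasing = TRUE), 3)
print(far)
```

With the lab's objects, the same idea would use pca_prcomp$x in place of scores, so that the labels match the row numbers printed with dat.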

# 3d scatter plot
plot_ly(data = scores_df, x = ~PC1, y = ~PC2, z = ~PC3, type = 'scatter3d',
        mode = 'markers', text = ~team)

Your turn

Graph several biplots with biplot(), using different values of scale (e.g. 0, 0.3, 0.5, 1). How do the relative positions of the arrows change with respect to the points? Under which scale value do you find it easier to read the biplot?

biplot(pca_prcomp, scale = 0)

biplot(pca_prcomp, scale = 0.3)

biplot(pca_prcomp, scale = 0.5)

biplot(pca_prcomp, scale = 1)
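
To see why the arrows and points move relative to each other, it can help to compute the plotted coordinates by hand. The helper biplot_coords() below is hypothetical (not part of R) and reflects an assumption about what biplot() does for prcomp objects, based on the description in ?biplot.princomp: the scores are divided by lam^scale and the loadings are multiplied by lam^scale, where lam is the PC standard deviation times sqrt(n). USArrests is used as a stand-in data set.

```r
# USArrests ships with R; used here only as a stand-in for dat
pca <- prcomp(USArrests, scale. = TRUE)
n <- nrow(pca$x)

# Hypothetical helper: coordinates assumed to be drawn by biplot() for two PCs
biplot_coords <- function(pca, scale = 1, choices = 1:2) {
  lam <- pca$sdev[choices] * sqrt(nrow(pca$x))
  lam <- if (scale != 0) lam^scale else rep(1, length(choices))
  list(
    obs = sweep(pca$x[, choices], 2, lam, "/"),        # points
    var = sweep(pca$rotation[, choices], 2, lam, "*")  # arrows
  )
}

bp0 <- biplot_coords(pca, scale = 0)   # raw scores and raw loadings
bp1 <- biplot_coords(pca, scale = 1)   # points shrunk, arrows stretched

# variance of the point coordinates along each axis
print(apply(bp0$obs, 2, var))   # the eigenvalues
print(apply(bp1$obs, 2, var))   # the same value, 1/n, on both axes
```

Under this assumption, scale = 0 shows the observations with their original spread along each PC, while scale = 1 equalizes the spread of the points and moves the differences in variance into the arrows.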